---
title: Monitoring support for generative models
description: The text generation target type for DataRobot custom and external models is compatible with generative Large Language Models (LLMs), allowing you to deploy generative models, make predictions, monitor model performance statistics, export data, and create custom metrics.
section_name: MLOps
maturity: premium

---

# Monitoring support for generative models

!!! info "Availability information"
    Monitoring support for generative models is a premium feature. Contact your DataRobot representative or administrator for information on enabling this feature.

    <b>Feature flags:</b> Enable Monitoring Support for Generative Models, [Enable the Injection of Runtime Parameters for Custom Models](pp-cus-model-runtime-params)

The text generation target type for custom and external models, a premium LLMOps feature, allows you to deploy generative Large Language Models (LLMs), make predictions, monitor service health, usage, and data drift statistics, and create custom metrics. DataRobot supports LLMs through two deployment methods:

* [Create a text generation model as a custom inference model in DataRobot](#create-and-deploy-a-generative-custom-model): Create and deploy a text generation model using DataRobot's Custom Model Workshop, calling the LLM's API to generate text instead of performing inference directly and allowing DataRobot MLOps to access the LLM's input and output for monitoring. To call the LLM's API, you should [enable public network access for custom models](custom-model-resource-mgmt).

* [Monitor a text generation model running externally](#create-and-deploy-an-external-generative-model): Create and deploy a text generation model on your infrastructure (local or cloud), using the monitoring agent to communicate the input and output of your LLM to DataRobot for monitoring.

## Create and deploy a generative custom model {: #create-and-deploy-a-generative-custom-model }

Custom inference models are user-created, pretrained models that you upload to DataRobot (as a collection of files) via the [Custom Model Workshop](custom-model-workshop/index). After you upload the model artifacts, you can create, test, and deploy custom inference models to DataRobot's centralized deployment hub.

### Add a generative custom model {: #add-a-generative-custom-model }

To add a generative model to the Custom Model Workshop:

1. Click **Model Registry > Custom Model Workshop** and, on the **Models** tab, click **+ Add new model**.

    ![](images/cmodel-1.png)

2. In the **Add Custom Inference Model** dialog box, under **Target type**, click **Text Generation**.

    ![](images/text-generation-custom-model.png)

3. Enter a **Model name** and **Target name**. In addition, you can click **Show Optional Fields** to define the language used to build the model and provide a description.

4. Click **Add Custom Model**. The new custom model opens to the **Assemble** tab.

### Assemble and deploy a generative custom model {: #assemble-and-deploy-a-generative-custom-model }

To assemble, test, and deploy a generative model from the Custom Model Workshop:

1. On the right side of the **Assemble** tab, under **Model Environment**, select a model environment from the **Base Environment** list. The model environment is used for [testing](custom-model-test) and [deploying](deploy-custom-inf-model) the custom model.

    ![](images/cmodel-assemble-add-env.png)

    !!! note
        The **Base Environment** pulldown menu includes [drop-in model environments](drop-in-environments), if any exist, as well as [custom environments](custom-environments#create-a-custom-environment) that you can create.

2. On the left side of the **Assemble** tab, under **Model**, drag and drop files or click **Browse local files** to upload your LLM's custom model artifacts. Alternatively, you can import model files from a [remote repository](custom-model-repos).

    ![](images/cmodel-assemble-add-files.png)

    !!! important
        If you click **Browse local files**, you have the option of adding a **Local Folder**. The local folder should contain dependent files and additional assets required by your model, not the model itself; a model file placed inside the folder is not accessible to DataRobot. Instead, the model file must exist at the root level, where it can reference the dependencies in the folder.
    
    A basic LLM assembled in the Custom Model Workshop should include the following files:

    File                  | Contents
    ----------------------|-------------
    `custom.py`           | The [custom model code](structured-custom-models), calling the LLM service's API through [public network access for custom models](custom-model-resource-mgmt).
    `model-metadata.yaml` | The [runtime parameters](pp-cus-model-runtime-params) required by the generative model.
    `requirements.txt`    | The [libraries (and versions)](custom-model-dependencies) required by the generative model.

    The dependencies from `requirements.txt` appear under **Model Environment** in the **Model Dependencies** box.

3. After you add the required model files, [add training data](custom-model-training-data). To provide a training baseline for drift monitoring, you should upload a dataset containing _at least_ 20 rows of prompts and responses relevant to the topic your generative model is intended to answer questions about. These prompts and responses can be taken from documentation, manually created, or generated.

4. Next, click the **Test** tab, click **+ New test**, and then click **Start test** to run the **Startup** and **Prediction error** tests, the only tests supported for the **Text Generation** target type.

5. Click **Register to deploy**, [provide the model information](custom-model-reg), and click **Add to registry**.

    The model opens on the **Registered Models** tab.

6. In the registered model version header, click **Deploy**, and then [configure the deployment settings](add-deploy-info).

    You can now [make predictions](predictions/index) as you would with any other DataRobot model.
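To make the assembly step more concrete, the following is a minimal sketch of the `custom.py` hooks described above. DataRobot custom models implement `load_model` and `score` hooks, but everything else here is an illustrative assumption: the `call_llm_api` helper is a stand-in for a real LLM provider call (made possible by the model's public network access), and the `promptText` input column, `completion` output column, and `LLM_API_URL` variable are hypothetical names that would instead match your deployment's configured feature and target names.

```python
import os
import pandas as pd

# Hypothetical stand-in for a real LLM API call; in production this would
# call your LLM provider over the custom model's public network access.
def call_llm_api(prompt: str, api_url: str) -> str:
    return f"[generated response for: {prompt}]"

def load_model(code_dir):
    # DataRobot calls this hook once at startup. For a generative model that
    # delegates inference to an external service, the "model" can simply be
    # the configuration needed to reach that service. (Simplified: a real
    # model might read this from injected runtime parameters instead.)
    return {"api_url": os.environ.get("LLM_API_URL", "https://example.invalid")}

def score(data: pd.DataFrame, model, **kwargs) -> pd.DataFrame:
    # DataRobot passes the inference data as a DataFrame; the prompt column
    # name matches the input feature of your deployment.
    responses = [call_llm_api(p, model["api_url"]) for p in data["promptText"]]
    # The output column should match the target name configured for the model.
    return pd.DataFrame({"completion": responses})
```

Because the `score` hook receives every prompt and returns every response, DataRobot MLOps can observe the LLM's input and output for the monitoring described below.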

## Create and deploy an external generative model {: #create-and-deploy-an-external-generative-model }

External model packages allow you to register and deploy external generative models. You can use the [monitoring agent](mlops-agent/index) to access MLOps monitoring capabilities with these model types.

To create and deploy a model package for an external generative model:

1. Click **Model Registry** and, on the **Registered Models** tab, click **Add new package** and select **New external model package**.

    ![](images/reg-ext-model.png)

2. In the **Register new external model** dialog box, from the **Prediction type** list, click **Text generation** and [add the required information](ext-model-reg) about the agent-monitored generative model. To provide a training baseline for drift monitoring, in the **Training data** field, you should upload a dataset containing _at least_ 20 rows of prompts and responses relevant to the topic your generative model is intended to answer questions about. These prompts and responses can be taken from documentation, manually created, or generated.

    ![](images/text-generation-external-model.png)

3. After you define all fields for the model package, click **Register**. The package is registered in the **Model Registry** and is available for use.

4. From the **Model Registry > Registered Models** tab, [locate and deploy the generative model](reg-deploy). 

5. Add [deployment information and complete the deployment](add-deploy-info).
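For an externally hosted model, the monitoring agent communicates prediction records from your infrastructure to the deployment created above. The sketch below uses only the standard library to show the kind of prompt/response record involved; the field names and the `build_monitoring_record` function are illustrative assumptions, not the actual reporting API (in practice you would use the `datarobot-mlops` package's reporting calls, which the agent then forwards to DataRobot).

```python
import json
import time

def build_monitoring_record(prompt: str, response: str, latency_ms: float) -> str:
    # Illustrative record shape: the real datarobot-mlops reporting calls
    # accept feature data and predictions directly, but the information
    # communicated for monitoring is essentially this.
    record = {
        "timestamp": time.time(),
        "features": {"promptText": prompt},   # hypothetical input column name
        "prediction": response,               # the LLM's generated text
        "execution_time_ms": latency_ms,      # feeds service health statistics
    }
    return json.dumps(record)

# Example: report one generation from an externally hosted LLM.
payload = build_monitoring_record("Summarize our docs.", "Here is a summary...", 812.5)
```

Reporting both the prompt (as a feature) and the response (as the prediction) is what enables the service health, usage, and data drift views described in the next section.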

## Monitor a deployed generative model {: #monitor-a-deployed-generative-model }

To monitor a generative model in production, you can view [service health](service-health) and [usage](deploy-usage) statistics, export [deployment data](data-export), create [custom metrics](custom-metrics), and identify [data drift](data-drift).

=== "Service Health"

    ![](images/text-generation-service-health.png)

=== "Usage"

    ![](images/text-generation-usage.png)

=== "Data Export"

    ![](images/text-generation-data-export.png)

=== "Custom Metrics"

    ![](images/text-generation-custom-metrics.png)

=== "Data Drift"

    ![](images/text-generation-data-drift.png)

### Data drift for generative models {: #feature-details-for-generative-models }

To monitor drift in a generative model's prediction data, DataRobot compares new prompts and responses to the prompts and responses in the training data you uploaded during model creation. To provide an adequate training baseline for comparison, the uploaded training dataset should contain _at least_ 20 rows of prompts and responses relevant to the topic your model is intended to answer questions about. These prompts and responses can be taken from documentation, manually created, or generated.

On the **Data Drift** tab for a generative model, you can view the [**Feature Drift vs. Feature Importance**](data-drift#feature-drift-vs-feature-importance-chart), [**Feature Details**](#feature-details-for-generative-models), and [**Drift Over Time**](data-drift#drift-over-time-chart) charts: 

![](images/text-generation-data-drift.png)

To learn how to adjust the **Data Drift** dashboard to focus on the model, time period, or feature you're interested in, see the [Configure the Data Drift dashboard](data-drift#configure-the-data-drift-dashboard) documentation.

The **Feature Details** chart includes new functionality for text generation models, providing a word cloud visualizing differences in the data distribution for each token in the dataset between the training and scoring periods. By default, the **Feature Details** chart includes information about the _question_ (or prompt) and _answer_ (or model completion/output):

![](images/text-generation-feature-details-select.png)

Feature  | Description
---------|-------------
question | A word cloud visualizing the difference in data distribution for each _user prompt_ token between the training and scoring periods and revealing how much each token contributes to data drift in the user prompt data.
answer   | A word cloud visualizing the difference in data distribution for each _model output_ token between the training and scoring periods and revealing how much each token contributes to data drift in the model output data.

!!! note
    The feature names for the generative model's input and output depend on the feature names in your model's data; therefore, the _question_ and _answer_ features in the example above will be replaced by the names of the input and output columns in your model's data.

You can also designate other features for data drift tracking; for example, you could decide to track the model's _temperature_, monitoring the level of creativity in the generative model's responses from high creativity (1) to low (0).

To interpret the feature drift word cloud for a text feature like _question_ or _answer_, hover over a user prompt or model output token to view the following details:

![](images/text-generation-feature-details.png)

Chart element      | Description
-------------------|------------
Token              | The tokenized text represented by the word in the word cloud. Text size represents the token's drift contribution and text color represents the dataset prevalence. Stop words are hidden from this chart.
Drift contribution | How much this particular token contributes to the feature's drift value, as reported in the **Feature Drift vs. Feature Importance** chart.
Data distribution  | How much more often this particular token appears in the training data or the predictions data. <ul><li><span style="color: blue">Blue</span>: This token appears `X`% more often in training data.</li><li><span style="color: red">Red</span>: This token appears `X`% more often in predictions data.</li></ul>
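The per-token data distribution statistic in this table can be approximated offline. The sketch below is a simplification (DataRobot's actual drift computation and tokenization are more sophisticated, and whitespace splitting stands in for real tokenization): it compares each token's relative frequency between the training and scoring periods, with the sign of the difference mirroring the blue/red distinction in the word cloud.

```python
from collections import Counter

def token_distribution_shift(training_texts, scoring_texts):
    # Relative frequency of each token in each period; a positive difference
    # means the token is more prevalent in training data (blue), a negative
    # difference means it is more prevalent in predictions data (red).
    def freqs(texts):
        counts = Counter(tok.lower() for t in texts for tok in t.split())
        total = sum(counts.values()) or 1
        return {tok: n / total for tok, n in counts.items()}

    train, score = freqs(training_texts), freqs(scoring_texts)
    tokens = set(train) | set(score)
    return {tok: train.get(tok, 0.0) - score.get(tok, 0.0) for tok in tokens}
```

Tokens with large absolute shifts are the ones that would appear largest in the word cloud, since they contribute most to the feature's drift value.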

!!! tip
    When your pointer is over the word cloud, you can scroll up to zoom in and view the text of smaller tokens.
